Assessment of Noninferiority Margins in Cardiovascular Medicine Trials

Background: Noninferiority trials are increasingly common in cardiovascular medicine, but their reporting and interpretation are challenging, particularly when an absolute risk difference is used as the noninferiority margin.
Objectives: This study aimed to investigate the effect of using absolute rather than relative noninferiority margins in cardiovascular trials.
Methods: We reviewed noninferiority trials presented at major cardiovascular conferences from 2015 to 2022 and published within the same period. Based on the actual versus anticipated event rates in the control group, we recalculated the absolute noninferiority margin and reassessed the trial results. The primary outcome of interest was the proportion of trials with a different interpretation after recalculation. Additionally, we analyzed the conclusion statements of these trials to determine whether cautionary notes for the interpretation of study results were included.
Results: We analyzed a total of 768 trials, of which 88 had a noninferiority design and 66 used an absolute noninferiority margin. Of 48 comparisons from 45 trials qualifying for the analysis, 11 (22.9%) had divergent results after recalculation of the absolute noninferiority margin based on the observed rather than anticipated event rate. Ten trials originally claiming noninferiority did not meet it after the margin recalculation. None of them included statements suggesting cautionary interpretation of the study results in the conclusion section. Compared with the other trials, these displayed a larger median difference between anticipated and recalculated noninferiority margins (44.7% [IQR: 38.6%-56.7%] vs 15.3% [IQR: −1.5% to 28.9%]; P < 0.001).
Conclusions: Recalculating noninferiority margins based on actual event rates, rather than anticipated ones, led to different outcomes in approximately 1 out of 4 cardiovascular trials, with most divergent trials lacking cautionary interpretation. These findings emphasize the importance of using or supplementing the relative noninferiority margin, particularly in studies with significant deviations between observed and expected event rates. This underscores the critical need for enhanced methodological and reporting standards in noninferiority trials, especially those employing absolute margins.

[3] However, several types of bias in trial design and conduct may affect their results and interpretation.4,13-17 These issues can be even more challenging in the setting of a noninferiority design. Experts and regulatory authorities have established some critical considerations in this scenario.13,14,18 In particular, the outcome of a noninferiority trial depends on where the CI of the effect size for a treatment or strategy lies relative to the noninferiority margin.18,19 Therefore, the choice of the margin represents a key issue for the validity and credibility of a noninferiority trial. To establish an accurate summary estimate of the treatment effect, regulators recommend that previous studies of the active control versus placebo be evaluated and that, as appropriate, the effect size be obtained by pooling the available measures with meta-analytic methods, with the final aim of preserving more than half of the putative effect of the active control versus placebo when selecting the noninferiority margin.20,21
The noninferiority margin is typically prespecified by the investigators either as an absolute risk difference (ARD) or as a relative risk ratio (RRR) of the treatment effect. An absolute margin is preferable for assessing infrequent events or for communicating risk to an individual patient, while relative metrics are helpful at the population level.22,23 In cardiovascular medicine, where noninferiority designs are common for introducing new treatments, a study by Simonato et al found that about a third of noninferiority trials in interventional cardiology reached different conclusions when using ARD margins compared with RRR margins. However, their focus on coronary stent trials limits broader applicability, and they did not examine how these trials were reported.30,31
Our hypothesis is that noninferiority trials in this field frequently exhibit discordant outcomes based on prespecified vs recalculated ARD margins, and that many of these trials lack adequate cautionary interpretation in their conclusions when the study results are uncertain because of lower-than-expected event rates. To fill this gap, we aimed to characterize the reporting and interpretation of noninferiority trials across the broad field of cardiovascular medicine by assessing the prevalence of different types of noninferiority margin and by evaluating the impact of using absolute rather than relative noninferiority margins in the context of lower-than-anticipated event rates.
Noninferiority margins and treatment effect sizes with corresponding CIs were also scrutinized. This study did not require ethical approval as it involved the analysis of publicly available data and did not include any direct interaction with human or animal subjects.

METHODS
ASSESSMENT OF NONINFERIORITY MARGIN. The relative frequency of using an ARD or an RRR as the noninferiority margin was analyzed. Noninferiority trials using an ARD were further assessed if none of the following cases occurred: 1) the study used an ARD margin, but the results were reported using the upper confidence boundary of an RRR; and 2) the primary endpoint was reported as a continuous variable or event-free rate. To evaluate the impact of using an absolute noninferiority margin, we recalculated the ARD margin based on the event rate observed at the end of the trial in the control group (ie, as opposed to the event rate anticipated by the investigators) (Supplemental Figure 1). In brief, for each ARD margin, the corresponding RRR margin was calculated as the ratio between the acceptable rate of events in the control arm (ie, anticipated event rate plus ARD margin of noninferiority) and the anticipated event rate in the control arm.
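The recalculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The first step (ARD margin to RRR margin) follows the text; the second step (applying that RRR margin to the observed control event rate to obtain a recalculated ARD margin) is an assumption, although it reproduces the OPTIMIZE IDE figures quoted in the Discussion (6.5%, 3.58%, and 9.5% giving about 1.55 and 5.2%). The decision rule, the function names, and the example rates are likewise hypothetical.

```python
def relative_margin(anticipated_rate: float, ard_margin: float) -> float:
    """RRR margin implied by an ARD margin at the anticipated control event rate.

    Both arguments are percentages (e.g. 6.5 for 6.5%).
    """
    if anticipated_rate <= 0:
        raise ValueError("anticipated_rate must be positive")
    # Acceptable event rate (anticipated rate + ARD margin) over anticipated rate.
    return (anticipated_rate + ard_margin) / anticipated_rate


def recalculated_ard_margin(anticipated_rate: float, ard_margin: float,
                            observed_rate: float) -> float:
    """ARD margin obtained by holding the RRR margin fixed (assumed step)."""
    rrr = relative_margin(anticipated_rate, ard_margin)
    return observed_rate * (rrr - 1.0)


def noninferiority_met(upper_confidence_limit: float, margin: float) -> bool:
    """Assumed decision rule: the upper CI limit of the risk difference is below the margin."""
    return upper_confidence_limit < margin


# Hypothetical trial: 10% anticipated, 3% ARD margin, 6% observed in the control arm.
new_margin = recalculated_ard_margin(10.0, 3.0, 6.0)
print(f"RRR margin: {relative_margin(10.0, 3.0):.2f}")
print(f"Recalculated ARD margin: {new_margin:.2f}%")
# A hypothetical upper confidence limit of 2.5% passes the prespecified 3% margin
# but not the recalculated one.
print(noninferiority_met(2.5, 3.0), noninferiority_met(2.5, new_margin))
```

With a lower-than-anticipated control event rate, the recalculated margin shrinks, so a result that met the prespecified absolute margin may no longer meet the recalculated one.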

RESULTS
The study flow chart is shown in Figure 1. [...] margins, respectively; P = 0.111). There were also no statistically significant differences in the publication timing (P = 0.094) and in all the other variables explored (Table 1).
ASSESSMENT OF NONINFERIORITY MARGIN. After applying exclusion criteria, 48 analyses from 45 trials (ie, 3 trials reported 2 eligible analyses of co-primary endpoints) were eligible for recalculation of their noninferiority margins (Central Illustration). There were no significant differences in baseline characteristics between analyses eligible and not eligible for recalculation (Supplemental Table 3). In the eligible analyses, noninferiority was declared in 42 out of 48 (87.5%).
After the recalculation of the noninferiority margin, 31 analyses (64.6%) concluded for

DISCUSSION
The main findings of this study can be summarized [...]
Noninferiority trials have been increasingly performed over the last decades, and a rigorous methodology is of utmost importance to avoid the inappropriate adoption of a treatment that may threaten the outcomes of patients.31 For instance, "biocreep" is a detrimental process that can lead to the acceptance of an inadequate treatment as a result of a stepwise sequence of noninferiority proofs entailing a gradual loss of treatment effect.33 To standardize and improve the architecture of noninferiority trials, [...]
The OPTIMIZE IDE trial represents a notable exception in our study: its interpretation was reversed after margin recalculation, but in the direction opposite to that of the other trials with divergent results. In this study, 1,639 patients were randomized to either a low-profile, fixed-wire drug-eluting stent or a conventional drug-eluting stent. The investigators anticipated a target lesion failure rate (primary endpoint) at 1 year of 6.5% in the control arm and defined an ARD noninferiority margin of 3.58% (corresponding to an RRR margin of 1.55). The difference in the primary endpoint rate between the 2 treatments was 0.08%, with an upper confidence limit of 3.8%: noninferiority was not met, and the trial was reported as failing the noninferiority hypothesis. However, several considerations may challenge the interpretation of this trial. Indeed, the higher-than-anticipated observed event rate in the control arm (9.5% instead of 6.5%) pushed the noninferiority margin up to 5.2%, theoretically enabling a claim of noninferiority.
Regulatory authorities recommend prioritizing a relative over an absolute margin, which is particularly desirable when there is a risk of a lower-than-anticipated event rate or the event rate is unpredictable.20,21 Relative metrics are advantageous at the population level (eg, to assess the effects of a treatment compared to a previous one) and are dimensionless (ie, do not numerically increase over observation periods, therefore remaining comparable regardless of the timing of the assessment).22,23 However, since they can be clinically less meaningful, particularly in the assessment of infrequent events, absolute metrics are preferred to express and communicate risks at the individual-patient level despite being susceptible to influence from individual factors (eg, health status, risk factors, comorbidities).22,23 Choosing an ARD conveys some risk of wrongly rejecting the null hypothesis (ie, inferiority of the experimental arm) if the trial terminates with a lower-than-anticipated event rate.34 This is due to a calibration inaccuracy that occurs when the calculation relies mathematically on a difference rather than on a ratio (ie, the difference is influenced by the magnitude of event rates, while the ratio is dimensionless and therefore avoids any distortion).
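The calibration issue can be made concrete with a short numeric sketch. The snippet below holds an ARD margin fixed and shows the risk ratio it tolerates as the control event rate falls. The margin and the event rates are hypothetical values chosen for illustration, and the function name is not taken from the paper.

```python
def implied_relative_margin(control_rate: float, ard_margin: float) -> float:
    """Risk ratio tolerated by a fixed ARD margin at a given control event rate.

    Both arguments are percentages (e.g. 5.0 for 5%).
    """
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (control_rate + ard_margin) / control_rate


ARD_MARGIN = 3.0  # hypothetical absolute margin, in percentage points

# Hypothetical control event rates, from the anticipated 10% downward.
for rate in (10.0, 7.5, 5.0, 2.5):
    ratio = implied_relative_margin(rate, ARD_MARGIN)
    print(f"control rate {rate:4.1f}% -> tolerated risk ratio {ratio:.2f}")
```

The same 3-point absolute margin that tolerates a 30% relative excess of events at a 10% control rate tolerates more than a doubling of events at a 2.5% control rate, which is why a lower-than-anticipated event rate makes an absolute margin more permissive.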
Although some investigations on the reporting of noninferiority trials have already been performed in other areas of medicine, they are limited in the cardiovascular field, and the accuracy of their interpretation has not been systematically investigated. In particular, the focus of our study was on interpretation and reporting of noninferiority trials in cardiovascular medicine. Recently, the reporting of noninferiority trials has been analyzed in the field of coronary interventions, where about 1 out of 3 studies claiming noninferiority based on an ARD displayed different results after recalculation.30 This is higher than in our study, which focused on a broader area of interest, including trials of drugs, devices, and strategies in cardiovascular medicine. The use of different inclusion criteria is likely responsible for the lower proportion of divergent results after recalculation in our study.
Among trials with divergent results after margin recalculation in our series, one showed an inverse discordance due to a higher-than-anticipated event rate.32 This case represents an example of why the recalculation of the ARD noninferiority margin can be important also in case of higher-than-anticipated event rates. Indeed, underestimation of event rates may lead a treatment that is actually noninferior to the control to be erroneously declared inferior, not published, and eventually excluded from further clinical development.
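The inverse discordance can be traced with the OPTIMIZE IDE figures quoted earlier in this Discussion. The worked calculation below is a sketch: the symbols ($p_{\mathrm{ant}}$ and $p_{\mathrm{obs}}$ for the anticipated and observed control event rates, $\Delta_{\mathrm{ARD}}$ for the absolute margin, $M_{\mathrm{RRR}}$ for the relative margin) are introduced here for illustration, and the second line assumes that the recalculated margin is obtained by holding the relative margin fixed.

```latex
\begin{align}
M_{\mathrm{RRR}} &= \frac{p_{\mathrm{ant}} + \Delta_{\mathrm{ARD}}}{p_{\mathrm{ant}}}
  = \frac{6.5\% + 3.58\%}{6.5\%} \approx 1.55 \\
\Delta_{\mathrm{ARD}}^{\mathrm{recalc}} &= p_{\mathrm{obs}} \,(M_{\mathrm{RRR}} - 1)
  = 9.5\% \times (1.55 - 1) \approx 5.2\% \\
\mathrm{UCL} &= 3.8\% > 3.58\% \quad \text{(prespecified margin: noninferiority not met)} \\
\mathrm{UCL} &= 3.8\% < 5.2\% \quad \text{(recalculated margin: noninferiority met)}
\end{align}
```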
The absence of statements of cautionary interpretation of the study results, unduly highlighting the experimental treatment as beneficial despite fragile statistical significance in the primary outcome, can take a variety of forms, including an excessive focus on statistically significant results with selective or incomplete reporting, emphasis on secondary or subgroup analyses, interpretation of statistically nonsignificant results as proof of equivalence, or emphasis on beneficial effects despite a lack of statistical significance. Importantly, the absence of cautionary interpretation should not necessarily be considered fraudulent conduct, since it can also arise from unconscious bias toward or against a treatment or from insufficient methodological knowledge.
Consequently, our study underscores that the noninferiority setting involves multiple considerations beyond the noninferiority margin, and it emphasizes that a rigorous interpretation of these studies requires a comprehensive analysis of all evidence, especially in cases of significant deviations between expected and observed event rates. A large investigation on the interpretation of study results has been conducted among negative or neutral oncology noninferiority trials, where it affected 3 out of 4 studies.24 In our study, a lack of phrasing suggesting the need for cautionary interpretation was identified in all the analyses showing divergent results after the margin recalculation. Interestingly, the high median percentage difference between the prespecified and the recalculated margins in analyses with divergent results seemed to be rooted in the variance between anticipated and observed event rates found in these studies. This highlights a growing challenge in designing randomized trials, especially in rapidly advancing fields where interventions improve quickly, thus affecting performance and reducing adverse events, which are key trial endpoints. In fact, inaccuracies in study design predispose to a change in study results when the noninferiority margin is recalculated, thereby substantially increasing the likelihood of claiming noninferiority based on inappropriate assumptions.20,21
To the best of our knowledge, this is the first study to provide a comprehensive appraisal of critical issues related to noninferiority trials in cardiovascular medicine with a focus on the assessment of the noninferiority margin. Our findings imply that Authors, Reviewers, and Editors should be aware of the risks of inappropriate reporting and interpretation of study results when designing, reporting, interpreting, and commenting on noninferiority trials, particularly if absolute metrics are used or in case of a large discrepancy between observed and anticipated event rates in the control arm.
STUDY LIMITATIONS. We acknowledge some limitations of this analysis. First, the lower-than-anticipated event rates may be influenced by confounders, including a higher-than-anticipated efficacy of the active control; however, we focused our analysis on the presence and impact of a lower-than-anticipated event rate, regardless of its causal mechanism. Second, our recalculated noninferiority margin is modified in a data-driven way; therefore, our method might be exposed to a certain degree of type I error inflation, and it cannot be recommended universally. To address this issue, Quartagno et al 35,36 proposed the concept of power-stabilizing noninferiority boundaries to handle unexpected event risk in the control group. Third, we did not assess the impact of study misinterpretations on regulatory approvals or guideline formulations, a complex area beyond the scope of this observational cohort study. Finally, while observer bias during full-text assessments of peer-reviewed manuscripts cannot be entirely excluded for the secondary endpoint investigation, it was mitigated by using a standardized and reproducible mathematical calculation for the primary endpoint.

PERSPECTIVES
COMPETENCY IN MEDICAL KNOWLEDGE: Noninferiority trials are becoming more common in cardiovascular medicine, but their reporting and interpretation can be difficult, particularly when using an absolute risk difference as the noninferiority margin. In cardiovascular medicine, absolute metrics are frequently preferred; however, using an absolute margin may carry some risk of regression towards noninferiority if the trial terminates with a lower-than-anticipated event rate. The adoption of both absolute and relative noninferiority margins might be the safest solution to address this issue.
TRANSLATIONAL OUTLOOK: Among noninferiority trials in cardiovascular medicine, the majority adopted an ARD margin and almost 1 out of 4 presented divergent results after recalculating the noninferiority margin. Most trials did not include notes of cautionary interpretation in the conclusions section.
Our findings imply that Authors, Reviewers and Editors should be aware of the risks related to the choice of the noninferiority margin when dealing with designing, reporting, interpreting, and commenting on noninferiority trials, particularly if absolute metrics are used and in case of large discrepancy between observed and anticipated event rates in the control arm.

STUDY SELECTION. In this cross-sectional study, late-breaking science sessions from 3 major international annual cardiology conferences (ie, American College of Cardiology, American Heart Association, and European Society of Cardiology) and a large interventional cardiology meeting (ie, Transcatheter Cardiovascular Therapeutics) were scrutinized for studies presented since January 2015 and simultaneously or subsequently published in extenso in a peer-reviewed medical journal as of November 10, 2022. This process was aimed at identifying a sample of peer-reviewed and published trials characterized by ample media exposure, likely representing the most cutting-edge research in the field and having an impact on practice, with less emphasis on studies with lower potential for dissemination and publication in top-tier journals. The search strategy is described in detail in the appendix (Supplemental Methods 1). The screening results were jointly reviewed by the authors to resolve potential discrepancies. All studies had to meet the following inclusion criteria: 1) randomized trials with at least 1 primary analysis powered for noninferiority; and 2) parallel design. Follow-up reports of previously presented or published noninferiority trials, subgroup analyses of noninferiority trials, and noninferiority trials with event-driven design were excluded. Studies that did not report information relevant to sample size calculation or determination of the noninferiority margin were also excluded.
ABBREVIATIONS AND ACRONYMS
ARD = absolute risk difference
RRR = relative risk ratio
[...] control arm was generally lower than the anticipated event rate (−17.1%; IQR: −35.8% to 8.7%). The articles corresponding to the trial presentation were published in a total of 14 peer-reviewed journals covering the fields of cardiovascular and general medicine, often simultaneously (56.8%) with the congress presentation. Trials that were not simultaneously published had a median time from presentation to publication of 235.0 days (IQR: 120.2-333.5 days).
as follows: 1) 3 noninferiority trials out of 4 used an ARD margin; 2) in the analyses using an ARD, almost 1 out of 4 had different results after recalculating the noninferiority margin; and 3) notes of cautionary interpretation were lacking in the conclusions of all the analyses that originally claimed [...]
CENTRAL ILLUSTRATION Assessment of Noninferiority Margins in Cardiovascular Medicine Trials
Greco A, et al. JACC Adv. 2024;3(7):101021.
Distribution of noninferiority randomized controlled trials with relative or absolute margin over the eligible population of studies. Number of included analyses that met noninferiority before (eg, according to authors) and after the recalculation of the noninferiority margin; prevalence of analyses not including cautionary notes of interpretation among studies with divergent results after the noninferiority margin recalculation. *Indicates 1 single trial that did not claim noninferiority in the original analysis, but met noninferiority criteria after recalculation, therefore being excluded from the assessment of study conclusions. ARD = absolute risk difference; NI = noninferiority; RRR = relative risk ratio.
several pillars have been identified by experts and regulatory authorities: 1) determination of the noninferiority margin based on the results of previous placebo-controlled trials of the active control; 2) choice of a noninferiority margin scale (ie, absolute or relative); 3) selection of appropriate endpoints (eg, clinical relevance, availability of historical data); 4) assay sensitivity over placebo (ie, superiority of the active control to placebo); 5) trial conduct (eg, adequacy of treatment administration, endpoint adjudication); and 6) selection of data analysis (eg, intention-to-treat, per-protocol, as treated).18

FIGURE 2 Trials Showing Divergent Results After the Recalculation of Noninferiority Margin
Randomized clinical trials are one of the most acknowledged and trusted sources of knowledge in current evidence-based medicine.
A number of issues can impair the proper reporting of a trial, including endpoint selection, assay sensitivity over placebo, adequacy of trial conduct, and selection of a proper data analysis plan. An additional peculiar challenge when dealing with noninferiority trials is represented by the determination of the type and magnitude of the noninferiority margin. Recalculating noninferiority margins based on actual event rates led to different outcomes in 1 out of 4 cardiovascular trials, and most differing trials lacked cautionary interpretation. To minimize these risks, recommendations from regulatory authorities should be highly regarded when conceiving, designing, conducting, reporting, and interpreting a noninferiority trial.
FUNDING SUPPORT AND AUTHOR DISCLOSURES
Dr Capodanno has received honoraria from Novo Nordisk, Sanofi, and Terumo, and institutional fees from Medtronic. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.
ADDRESS FOR CORRESPONDENCE: Prof Davide Capodanno, Azienda Ospedaliero-Universitaria Policlinico "G. Rodolico - San Marco", University of Catania, Via Santa Sofia, 78, Catania, 95100, Italy. E-mail: dcapodanno@unict.it.
REFERENCES
1. Collins R, MacMahon S. Reliable assessment of the effects of treatment on mortality and major morbidity, I: clinical trials. Lancet. 2001;357:373-380.
2. Zelen M. A new design for randomized clinical trials. N Engl J Med. 1979;300:1242-1245.

TABLE 1
Study Characteristics According to the Chosen Type of Noninferiority Margin
a The section 'Others' refers to trials not specifically focused on drug or device interventions. These include trials of therapeutic strategies, procedural approaches, and diagnostic or screening methodologies.
b Corresponding relative margin for trials using absolute metrics.
c Excluding simultaneously published trials.
[...] the noninferiority hypothesis, but noninferiority was re-established after recalculation, due to higher-than-anticipated event rates.32
ASSESSMENT OF TRIAL INTERPRETATION. None of the 10 analyses that originally claimed noninferiority but did not meet noninferiority after recalculation included cautionary notes to account for the chance of a differing interpretation (Central

TABLE 2
Characteristics of Trials Showing Divergent Results After the Recalculation of Noninferiority Margin
JAMA = Journal of the American Medical Association; NI = noninferiority; NEJM = New England Journal of Medicine; RD = risk difference; UCL = upper confidence limit.